8 research outputs found

    BagIt Fixer-Upper: Scaling BagIt Tools to Manage the Ingest of Petabytes of Digitization Work

    The New York Public Library has created over 1.5 PB of files from digitizing over 50,000 audio and video items for the long-term preservation of their content. This paper details the Library’s usage of the BagIt File Packaging Format during the Quality Assurance and Audit Submissions functions as defined by OAIS. It also discusses extensions of the bagit-python library to repair bags that do not pass those functions. Working with thousands of terabytes stored in hundreds of thousands of bags requires that our approaches to ingest scale appropriately. Common changes to bags, such as the accidental creation of system files in bags or purposeful edits of metadata files, will invalidate the entire bag. Noting and responding to these errors is critical for improving workflows, but manual response is impossible. Using the bagit-python library, NYPL has created tools to selectively clean system files from bag directories and manifests, update or add checksums, and create event logs of repairs.
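The kind of repair described in this abstract can be illustrated with a minimal sketch. It is not NYPL's tooling and does not use bagit-python; it relies only on the Python standard library and assumes a bag with a `data/` payload directory and a `manifest-<algorithm>.txt` file. The function name `clean_bag` and the `SYSTEM_FILES` set are hypothetical names chosen for illustration.

```python
from pathlib import Path

# Hypothetical set of system-file names to strip; extend as needed.
SYSTEM_FILES = {".DS_Store", "Thumbs.db"}


def clean_bag(bag_dir, algorithm="md5"):
    """Delete system files under data/ and drop their manifest entries.

    Returns a list of event strings, one per repair action.
    """
    bag = Path(bag_dir)
    events = []
    removed = set()

    # Step 1: remove system files from the payload directory.
    for path in sorted(bag.joinpath("data").rglob("*")):
        if path.is_file() and path.name in SYSTEM_FILES:
            rel = path.relative_to(bag).as_posix()
            path.unlink()
            removed.add(rel)
            events.append(f"removed file {rel}")

    # Step 2: drop manifest lines that pointed at the removed files.
    manifest = bag / f"manifest-{algorithm}.txt"
    kept = []
    for line in manifest.read_text().splitlines():
        parts = line.split(None, 1)  # "<checksum> <relative path>"
        if len(parts) == 2 and parts[1].strip() in removed:
            events.append(f"dropped manifest entry {parts[1].strip()}")
        else:
            kept.append(line)
    manifest.write_text("\n".join(kept) + ("\n" if kept else ""))

    return events
```

This sketch leaves tag manifests and the `Payload-Oxum` value in `bag-info.txt` untouched, so a complete repair would also need to regenerate those before the bag validates again. The returned event strings stand in for the event log of repairs mentioned in the abstract.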

    What is the Standard Format for Digitized Audio? - iPRES 2019 Amsterdam

    The best practices for representing analog audio with digital bitstreams are relatively clear: sample the signal with 24 bits of resolution at 96 kHz. The standards for storing the data are less clear, especially for media with complex configurations of faces, regions, and streams. Whether accomplished through metadata, file format, or both, the strategy chosen to represent the complexity of the original media has long-term preservation implications. Best practice guides rarely document these edge cases, and informal discussions with practitioners have revealed a wide range of practices. This paper aims to outline the specific challenges of representing complex audio objects after digitization and approaches that have been implemented but not widely adopted.

    Assessing High-volume Transfers from Optical Media at NYPL

    NYPL’s workflow for transferring optical media to long-term storage was met with a challenge: an acquisition of a collection containing thousands of recordable CDs and DVDs. Many programs take a disk-by-disk approach to imaging or transferring optical media, but to deal with a collection of this size, NYPL developed a workflow using a Nimbie AutoLoader and a customized version of KBNL’s open-source IROMLAB software to batch disks for transfer. This workflow prioritized quantity, but, at the outset, it was difficult to tell if every transfer was as accurate as it could be. We discuss the process of evaluating the success of the mass transfer workflow, and the improvements we made to identify and troubleshoot errors that could occur during the transfer. We first give background on the institution and on other institutions’ approaches to similar projects, then discuss in depth the process of gathering and analyzing data. We finish with a discussion of our takeaways from the project.

    Open Preservation Foundation Community Survey 2015

    This poster will present the headline results from the Open Preservation Community Survey 2015, which surveyed over 130 institutions around the world to establish the current state of the art in digital preservation practice. The survey focused on technology adoption and real-world infrastructure and architectures, including demographics about the type and size of the responding institution. The responses cover: staff roles and allocations; core digital preservation activities; content types accepted for long-term management; storage capacity and models; use of the cloud and consortial solutions; use of open source; repository and workflow systems; and tool adoption. The survey did not ask about policies or costs. In addition, comparisons are drawn with the PLANETS survey from 2009 to show changes in requirements and practice over time. The analysis and raw data will be published by the end of 2015.